Arrays from C

Creation and Initialization

  • Syntax:

    type name [elements];
    
    • The elements field within square brackets [], representing the number of elements in the array, must be a constant expression: the size of a built-in array is part of its type, so it must be known at compile time, before the program runs.

  • Examples:

int foo[5];         // array with 5 elements, not initialized (indeterminate values for a local array).
int baz[5] = { };   // array with 5 elements, zero-initialized.
int foo[5] = { 16, 2, 77, 40, 12071 }; // array with 5 elements, all initialized.
int foo[]  = { 16, 2, 77, 40, 12071 }; // array with 5 elements; the size is inferred from the initializer.
int foo[]  { 16, 2, 77, 40, 12071 };   // array with 5 elements, using uniform (brace) initialization (equivalent to the above).
  • Since C++11, the equal sign between the declaration and the initializer is optional. The last two statements are equivalent.
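  • A minimal sketch of the rules above, assuming C++11 or later. The names element_count, kCount, sum_of_zeroed and inferred_size are hypothetical helpers introduced only for this illustration.

```cpp
#include <cstddef>

// Hypothetical helper: the number of elements of a built-in array,
// deduced at compile time from the array's type.
template <typename T, std::size_t N>
constexpr std::size_t element_count(const T (&)[N]) {
    return N;
}

// The size must be a constant expression; a constexpr variable qualifies.
constexpr std::size_t kCount = 5;

// Sums a zero-initialized local array.
inline int sum_of_zeroed() {
    int baz[kCount] = {};  // every element starts at 0
    int total = 0;
    for (std::size_t i = 0; i < kCount; ++i) total += baz[i];
    return total;
}

// Returns the size the compiler infers from the initializer list.
inline std::size_t inferred_size() {
    int foo[] = { 16, 2, 77, 40, 12071 };
    return element_count(foo);
}
```

  • Here sum_of_zeroed() returns 0 and inferred_size() returns 5, since the initializer list has five values.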

Multidimensional Arrays

char century [100][365][24][60][60];
  • Multidimensional arrays are just an abstraction for programmers, since the same result can be achieved with a simple array whose size is the product of the dimensions:

int jimmy [3][5];   // is equivalent to
int jimmy [15];     // (3 * 5 = 15)  
  • The only difference is that with multidimensional arrays the compiler automatically keeps track of the size of each dimension. The following two pieces of code produce the exact same result, but the first uses a bidimensional array while the second uses a simple array:

    #define WIDTH  5
    #define HEIGHT 3
    
    int jimmy [HEIGHT][WIDTH];
    int n,m;
    
    int main() {
        for (n=0; n < HEIGHT; n++)
            for (m=0; m < WIDTH; m++) {
                jimmy[n][m]=(n+1)*(m+1);
            }
    }
    
    #define WIDTH  5
    #define HEIGHT 3
    
    int jimmy [HEIGHT * WIDTH];
    int n,m;
    
    int main () {
        for (n=0; n < HEIGHT; n++)
            for (m=0; m < WIDTH; m++) {
                jimmy[n*WIDTH+m]=(n+1)*(m+1);
            }
    }
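  • One way to check the equivalence claimed above is to fill both layouts and compare them. This is only a sketch: kHeight, kWidth and layouts_match are hypothetical names, and the constants replace the macros from the notes.

```cpp
#include <cstring>

constexpr int kHeight = 3;
constexpr int kWidth  = 5;

// Both layouts occupy the same number of bytes.
static_assert(sizeof(int[kHeight][kWidth]) == sizeof(int[kHeight * kWidth]),
              "2D and flat arrays have the same size");

// Fills a 2D array and a flat array with the same values and reports
// whether their contents are byte-for-byte identical.
inline bool layouts_match() {
    int grid[kHeight][kWidth];
    int flat[kHeight * kWidth];

    for (int n = 0; n < kHeight; ++n)
        for (int m = 0; m < kWidth; ++m) {
            grid[n][m]           = (n + 1) * (m + 1);
            flat[n * kWidth + m] = (n + 1) * (m + 1);
        }

    // Rows of a 2D array are stored one after another (row-major order),
    // so the raw bytes of both arrays should be equal.
    return std::memcmp(grid, flat, sizeof grid) == 0;
}
```

  • layouts_match() returns true, which matches the row-major index formula n*WIDTH+m used in the second version.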
    

T* or T foo[]

  • T*

    char* p = "hello";       // Deprecated conversion in C++98/03; ill-formed since C++11 (many compilers still accept it with a warning).
    const char* p = "hello"; // Correct version: the literal cannot be modified through p, which would be undefined behavior.
    
    • Allows reassignment

    • Allows dynamic allocation

    • Is natural for function parameters

    • Matches C string APIs

    • Can reference string literals without copying

  • T foo[]:

    char p[] = "hello";
    
    • Use when the size is known at compile time

    • Use when the storage should be owned locally (the literal is copied into the array)

    • Use when no reassignment is needed

  • They are not interchangeable; they serve different purposes.
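  • A small sketch of the difference, assuming C++11 or later. The function names are hypothetical and exist only for this illustration.

```cpp
#include <cstring>

// The literal "hello" has type const char[6]: 5 characters plus '\0'.
static_assert(sizeof("hello") == 6, "5 characters plus the terminator");

// An array initialized from a literal owns a writable copy of it.
inline bool array_copy_is_writable() {
    char owned[] = "hello";   // 6 chars copied into local storage
    owned[0] = 'j';           // fine: this storage belongs to the array
    return std::strcmp(owned, "jello") == 0;
}

// A pointer only refers to the literal and can be pointed elsewhere.
inline bool pointer_is_reassignable() {
    const char* view = "hello";  // no copy is made
    view = "world";              // reassignment is allowed
    return std::strcmp(view, "world") == 0;
}
```

  • Both functions return true. Writing owned = "world" would not compile, because arrays cannot be reassigned.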

Access

int foo[5];   // declaration of a new array
foo[2] = 75;  // access to an element of the array. 
  • It is syntactically correct to exceed the valid range of indices for an array. This can create problems, since accessing out-of-range elements does not cause errors at compile time, but it is undefined behavior and can cause errors at runtime.

  • These two expressions are equivalent:

*(foo+4)
foo[4]
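  • The equivalence can be checked for every valid index. A minimal sketch; the function name is hypothetical.

```cpp
// Returns true if the subscript form and the pointer-arithmetic form
// agree on both the value and the address of every element.
inline bool subscript_matches_pointer_arithmetic() {
    int foo[5] = { 16, 2, 77, 40, 12071 };
    for (int i = 0; i < 5; ++i) {
        if (foo[i] != *(foo + i)) return false;  // same value
        if (&foo[i] != foo + i)   return false;  // same address
    }
    return true;
}
```

  • The function returns true: foo[i] is defined by the language as *(foo + i).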

Discussion

  • dmitsuki:

    • An array is just this: if you have a starting address 0 and an int is 4 bytes wide, then to index the second position you start at the first position and go 4 bytes (4 * 1)

    • To get the third position it would be 4 * 2, etc. That's all an "array" is: a pointer, and then you do math to its address

  • Caio:

    • That sounds so weird, it seems like I'm just scrolling through memory without much consideration for typing or any boundary checking. Idk, maybe this is the way it is in the end, but sounds odd to me at first.

    • So you can happily store beyond the bounds of an array because to the CPU it's just another pointer address. Bounds checking and such is enforced at a higher level (i.e., the language that generated the instructions)

  • Lee:

    • Rest assured it's not intuitive to anyone. If things like this were intuitive, the industry and humanity would look very different.

  • Barinzaya:

    • That's C in a nutshell

    • It is what's actually happening at the CPU level. The CPU doesn't know about arrays, structures, etc. It just deals in "load a value from/store a value to this address". It doesn't know nor care about the larger structure that that value may be a part of.

  • dmitsuki:

    • That's why people made new languages with new features

    • But fundamentally that is what is happening

    • It's also why certain security problems exist, you can read past the bounds of an array and read memory you shouldn't
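  • The address arithmetic described in the discussion can be observed directly. This is a sketch only: byte_offset_of is a hypothetical helper, and it uses sizeof(int) rather than assuming an int is 4 bytes wide.

```cpp
#include <cstddef>

// Returns the distance in bytes between element 0 and element `index`
// of a local int array. `index` must be less than 8.
inline std::ptrdiff_t byte_offset_of(std::size_t index) {
    int values[8] = {};
    // Viewing the addresses as char pointers makes the subtraction count bytes.
    const char* base = reinterpret_cast<const char*>(&values[0]);
    const char* elem = reinterpret_cast<const char*>(&values[index]);
    return elem - base;
}
```

  • byte_offset_of(3) equals 3 * sizeof(int), which is 12 on platforms where an int is 4 bytes wide.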

Reassigning

  • In C++, built-in arrays are not assignable. After an array is created, you cannot assign a new initializer list to it.

  • Arrays are not first-class assignable objects.

    • Arrays cannot be assigned

    • Arrays cannot be copy-assigned

    • Arrays cannot be returned by value

    • Arrays cannot be passed by value

  • They are objects, but they lack assignment semantics.

  • This is a major reason people use pointers.

  • This is not valid:

    int foo[5];
    int bar[] = {1,2,3};
    
    
    foo[] = { 16, 2, 77, 40, 12071 };  // error: empty brackets are only allowed in a declaration
    foo   = { 16, 2, 77, 40, 12071 };  // error: an array cannot be assigned a braced list
    foo   = bar;                       // error: arrays are not assignable
    
    • Reasons:

      • foo[] without a size is only allowed at declaration.

      • Arrays do not support operator=: they have no copy assignment operator.

      • Arrays decay to pointers in most expressions.

      • Arrays are fixed-size objects.

  • This is valid:

    • Element-wise assignment:

    foo[0] = 16; 
    foo[1] = 2; 
    foo[2] = 77; 
    foo[3] = 40; 
    foo[4] = 12071; 
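  • Two common ways to get the effect of assigning a whole array, sketched under the assumption of C++11 or later. IntBlock and both function names are hypothetical.

```cpp
#include <algorithm>
#include <iterator>

// A built-in array wrapped in a struct gains copy assignment,
// because the compiler-generated operator= copies the member array.
struct IntBlock {
    int data[5];
};

// Copies element by element with std::copy, then returns the last element.
inline int copy_with_std_copy() {
    int src[5] = { 16, 2, 77, 40, 12071 };
    int dst[5] = {};
    std::copy(std::begin(src), std::end(src), std::begin(dst));
    return dst[4];
}

// Copies the whole array through the wrapper, then returns the middle element.
inline int copy_with_wrapper() {
    IntBlock a = {{ 16, 2, 77, 40, 12071 }};
    IntBlock b = {};
    b = a;  // valid: struct assignment copies the array member
    return b.data[2];
}
```

  • copy_with_std_copy() returns 12071 and copy_with_wrapper() returns 77. The wrapper is the same idea that std::array is built on.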
    

Passing an array as a function parameter

  • In C++, it is not possible to pass the entire block of memory represented by an array to a function directly as an argument. But what can be passed instead is its address. In practice, this has almost the same effect, and it is a much faster and more efficient operation.

  • To accept an array as parameter for a function, the parameters can be declared as the array type, but with empty brackets, omitting the actual size of the array.

    void procedure (int arg[]);
    
    #include <iostream>
    
    // Prints `length` elements of the array that `arg` points to.
    void print_array(int arg[], int length) {
        for (int n=0; n<length; ++n)
            std::cout << arg[n] << ' ';
        std::cout << '\n';
    }
    
    int main() {
        int first_array[]  = {5, 10, 15};
        int second_array[] = {2, 4, 6, 8, 10};
        print_array(first_array,3);
        print_array(second_array,5);
    }
    
    • The first parameter (int arg[]) accepts any array whose elements are of type int, whatever its length. For that reason, we have included a second parameter that tells the function the length of each array that we pass to it as its first parameter. This allows the for loop that prints out the array to know the range to iterate in the array passed, without going out of range.
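  • If the length should not be passed by hand, one alternative is to take the array by reference so that its size is deduced. This is a sketch, not part of the example above; sum_ptr, sum_ref and sums_agree are hypothetical names.

```cpp
#include <cstddef>

// Array parameter: really a pointer, so the length is passed separately.
inline int sum_ptr(const int arg[], std::size_t length) {
    int total = 0;
    for (std::size_t n = 0; n < length; ++n) total += arg[n];
    return total;
}

// Reference to array: N is deduced from the argument's type.
template <std::size_t N>
int sum_ref(const int (&arg)[N]) {
    return sum_ptr(arg, N);
}

// Both forms compute the same sums for the arrays used in the notes.
inline bool sums_agree() {
    int first_array[]  = { 5, 10, 15 };
    int second_array[] = { 2, 4, 6, 8, 10 };
    return sum_ptr(first_array, 3)  == sum_ref(first_array) &&
           sum_ptr(second_array, 5) == sum_ref(second_array) &&
           sum_ref(first_array) == 30;
}
```

  • sums_agree() returns true. The reference version only accepts real arrays, not pointers, because the size comes from the array type.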

  • In a way, passing an array as an argument always loses a dimension. For historical reasons, arrays cannot be directly copied, and thus what is really passed is a pointer.

  • In both C and C++:

    void foo(char s[]);
    // is the same as
    void foo(char* s);
    
  • Array parameters are adjusted to pointers in parameter lists, so many APIs naturally use char*.
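  • The compiler can confirm that both declarations have the same type. A minimal sketch assuming C++11 or later; takes_array, takes_pointer and parameter_is_pointer are hypothetical names.

```cpp
#include <type_traits>

// Declarations only; they are never called, so no definition is needed.
void takes_array(char s[]);
void takes_pointer(char* s);

// Both declarations have the function type void(char*).
static_assert(std::is_same<decltype(takes_array), decltype(takes_pointer)>::value,
              "char s[] and char* s declare the same parameter type");

// Inside the function, the parameter is a plain pointer.
inline bool parameter_is_pointer(char s[]) {
    return std::is_same<decltype(s), char*>::value;
}
```

  • The static_assert passes at compile time, and parameter_is_pointer returns true for any argument.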

For Multidimensional arrays

  • The format for a tridimensional array parameter is:

base_type[][depth][depth]
void procedure (int my_array[][3][4]);
  • Notice that the first brackets [] are left empty, while the following ones specify sizes for their respective dimensions. This is necessary in order for the compiler to be able to determine the depth of each additional dimension.
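  • A sketch of a function using that parameter form. The first dimension is not part of the parameter type, so it is passed separately. sum_blocks and demo_sum_blocks are hypothetical names.

```cpp
#include <cstddef>

// Sums `blocks` blocks of 3x4 ints. The sizes 3 and 4 are part of the
// parameter type; the first dimension must be supplied by the caller.
inline int sum_blocks(int my_array[][3][4], std::size_t blocks) {
    int total = 0;
    for (std::size_t b = 0; b < blocks; ++b)
        for (std::size_t r = 0; r < 3; ++r)
            for (std::size_t c = 0; c < 4; ++c)
                total += my_array[b][r][c];
    return total;
}

// Builds a 2x3x4 array with two non-zero elements and sums it.
inline int demo_sum_blocks() {
    int cube[2][3][4] = {};
    cube[0][0][0] = 1;
    cube[1][2][3] = 41;
    return sum_blocks(cube, 2);  // 1 + 41 = 42
}
```

  • demo_sum_blocks() returns 42. Passing an int[2][5][4] would not compile, because the inner sizes would not match the parameter type.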